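Commonsense knowledge can be leveraged to identify causal relations in text. In this work, we verbalize triples from Atomic2020, a wide-coverage commonsense reasoning knowledge graph, into natural language text and continually pretrain a BERT pretrained language model on it. We evaluate the resulting model on answering commonsense reasoning questions. Our results show that a continually pretrained language model augmented with commonsense reasoning knowledge outperforms our baselines on two commonsense causal reasoning benchmarks, COPA and BCOPA-CE, without additional improvements to the base model or fine-tuning on quality-enhanced data.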
translated by 谷歌翻译
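As a rough illustration of what verbalizing knowledge-graph triples into natural language text might look like, here is a minimal sketch. The relation names follow Atomic2020, but the templates, the `verbalize` helper, and the example triple are assumptions made for illustration; the abstract does not specify the wording actually used.

```python
# Hypothetical templates: the abstract does not state how relations are worded.
TEMPLATES = {
    "xIntent": "{head} because PersonX wanted {tail}.",
    "xEffect": "{head}. As a result, PersonX {tail}.",
    "xNeed": "Before {head}, PersonX needed {tail}.",
    "Causes": "{head} causes {tail}.",
}


def verbalize(head: str, relation: str, tail: str) -> str:
    """Turn a (head, relation, tail) triple into one plain-text sentence."""
    template = TEMPLATES.get(relation)
    if template is None:
        raise KeyError(f"no template for relation {relation!r}")
    return template.format(head=head.strip(), tail=tail.strip())


if __name__ == "__main__":
    # Example triple invented for illustration.
    print(verbalize("PersonX goes to the gym", "xIntent", "to get fit"))
```

Sentences produced this way could then serve as a plain-text corpus for continued pretraining with a masked language modeling objective.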
Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, little work has been done on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block, which is intended to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and significantly outperforms a DARTS network of similar size at layer counts as small as 2. Furthermore, with fewer layers, not only does it achieve higher accuracy with lower GMACs and parameter count, but GradCAM comparisons also show that our network detects distinctive features of target objects better than DARTS.
Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address the problem of reference resolution when people use natural expressions to choose between real-world entities. For example, given the choice "Should we make a Simnel cake or a Pandan cake?", a natural response from a non-expert may be indirect: "let's make the green one". Reference resolution has been little studied with natural expressions, so robustly understanding such language has large potential for improving naturalness in dialog, recommendation, and search systems. We create AltEntities (Alternative Entities), a new public dataset of entity pairs and utterances, and develop models for the disambiguation problem. Consisting of 42K indirect referring expressions across three domains, it enables, for the first time, the study of how large language models can be adapted to this task. We find they achieve 82%-87% accuracy in realistic settings, which, while reasonable, also invites further advances.
Recent advances in artificial intelligence (AI) have significantly intensified research in the geoscience and remote sensing (RS) field. AI algorithms, especially deep learning-based ones, have been developed and applied widely to RS data analysis. The successful application of AI covers almost all aspects of Earth observation (EO) missions, from low-level vision tasks like super-resolution, denoising, and inpainting, to high-level vision tasks like scene classification, object detection, and semantic segmentation. While AI techniques enable researchers to observe and understand the Earth more accurately, the vulnerability and uncertainty of AI models deserve further attention, considering that many geoscience and RS tasks are highly safety-critical. This paper reviews the current development of AI security in the geoscience and RS field, covering the following five important aspects: adversarial attack, backdoor attack, federated learning, uncertainty, and explainability. Moreover, the potential opportunities and trends are discussed to provide insights for future research. To the best of the authors' knowledge, this paper is the first attempt to provide a systematic review of AI security-related research in the geoscience and RS community. Available code and datasets are also listed in the paper to move this vibrant field of research forward.
Decompilation, the reversal of the compilation process, is an important tool in the reverse engineering of computer software. Recently, researchers have proposed using techniques from neural machine translation to automate the decompilation process. Although such techniques hold the promise of targeting a wider range of source and assembly languages, to date they have primarily targeted C code. In this paper we argue that existing neural decompilers have achieved higher accuracy at the cost of requiring language-specific domain knowledge, such as tokenizers and parsers to build an abstract syntax tree (AST) for the source language, which increases the overhead of supporting new languages. We explore a different tradeoff that, to the extent possible, treats the assembly and source languages as plain text, and show that this allows us to build a decompiler that is easily retargetable to new languages. We evaluate our prototype decompiler, Beyond The C (BTC), on Go, Fortran, OCaml, and C, and examine the impact of parameters such as tokenization and training data selection on the quality of decompilation, finding that it achieves decompilation results comparable to prior work in neural decompilation with significantly less domain knowledge. We will release our training data, trained decompilation models, and code to help encourage future research into language-agnostic decompilation.
Numerous models have tried to effectively embed knowledge graphs in low dimensions. Among the state-of-the-art methods, Graph Neural Network (GNN) models provide structure-aware representations of knowledge graphs. However, they often use the information of relations and their interactions with entities inefficiently. Moreover, most state-of-the-art knowledge graph embedding models suffer from scalability issues because they assign high-dimensional embeddings to entities and relations. To address these limitations, we propose a scalable general knowledge graph encoder that adaptively involves a powerful tensor decomposition method in the aggregation function of RGCN, a well-known relational GNN model. Specifically, the parameters of a low-rank core projection tensor, used to transform neighborhood entities in the encoder, are shared across relations to benefit from multi-task learning and incorporate relation information effectively. In addition, we propose a low-rank estimation of the core tensor using CP decomposition to compress the model, which is also applicable, as a regularization method, to other similar linear models. We evaluate our model on knowledge graph completion as a common downstream task. We train our model using a new loss function based on contrastive learning, which relieves the training limitation of the 1-N method on huge graphs. We improve RGCN performance on FB15k-237 by 0.42% with considerably lower embedding dimensionality.
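To make the compression argument concrete, the following minimal sketch shows how a rank-R CP decomposition represents a third-order tensor with three factor matrices instead of a dense array. The function names, the tensor shape, and the rank are assumptions chosen for illustration; the abstract does not give the actual dimensions or how the factors enter the RGCN aggregation.

```python
from math import prod


def cp_reconstruct(A, B, C):
    """Rebuild T[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r] from CP factors."""
    rank = len(A[0])
    return [
        [
            [sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
             for k in range(len(C))]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


def full_param_count(dims):
    """Parameters in a dense tensor with the given dimensions."""
    return prod(dims)


def cp_param_count(dims, rank):
    """Parameters in the CP factors: one (dim x rank) matrix per mode."""
    return rank * sum(dims)


if __name__ == "__main__":
    # Hypothetical sizes: a 100 x 100 x 100 core tensor at rank 10.
    dims, rank = (100, 100, 100), 10
    print(full_param_count(dims), cp_param_count(dims, rank))
```

Under these assumed sizes the dense tensor has 1,000,000 parameters while the CP factors have 3,000, which is the kind of saving a low-rank estimate is meant to provide.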
Gaussian Mixture Models (GMMs) are among the most potent parametric density estimators based on the kernel model and find application in many scientific domains. In recent years, with the dramatic growth of data sources, typical machine learning algorithms, e.g. Expectation Maximization (EM), encounter difficulties with high-dimensional and streaming data. Moreover, complicated densities often demand a large number of Gaussian components. This paper proposes a fast online parameter estimation algorithm for GMMs using first-order stochastic optimization. This approach provides a framework to cope with the challenges of GMMs when faced with high-dimensional streaming data and complex densities by leveraging a flexibly tied factorization of the covariance matrix. A new stochastic manifold optimization algorithm that preserves orthogonality is introduced and used along with well-known Euclidean-space numerical optimization. Numerous empirical results on both synthetic and real datasets demonstrate the effectiveness of our proposed stochastic method over EM-based methods in terms of a better-converged maximum of the likelihood function, fewer epochs needed for convergence, and less time per epoch.
The process of screening molecules for desirable properties is a key step in several applications, ranging from drug discovery to material design. During the process of drug discovery specifically, protein-ligand docking, or chemical docking, is a standard in-silico scoring technique that estimates the binding affinity of molecules with a specific protein target. Recently, however, as the number of virtual molecules available to test has rapidly grown, these classical docking algorithms have created a significant computational bottleneck. We address this problem by introducing Deep Surrogate Docking (DSD), a framework that applies deep learning-based surrogate modeling to accelerate the docking process substantially. DSD can be interpreted as a formalism of several earlier surrogate prefiltering techniques, adding novel metrics and practical training practices. Specifically, we show that graph neural networks (GNNs) can serve as fast and accurate estimators of classical docking algorithms. Additionally, we introduce FiLMv2, a novel GNN architecture which we show outperforms existing state-of-the-art GNN architectures, attaining more accurate and stable performance by allowing the model to filter out irrelevant information from data more efficiently. Through extensive experimentation and analysis, we show that the DSD workflow combined with the FiLMv2 architecture provides a 9.496x speedup in molecule screening with a <3% recall error rate on an example docking task. Our open-source code is available at https://github.com/ryienh/graph-dock.
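Oil spill detection has attracted increasing attention in recent years because marine oil spill accidents severely affect the environment, natural resources, and the lives of coastal inhabitants. Hyperspectral remote sensing images provide rich spectral information, which is beneficial for monitoring oil spills in complex ocean scenes. However, most existing methods detect oil spills in hyperspectral images (HSIs) with supervised or semi-supervised frameworks, which require considerable effort to annotate a sufficient number of high-quality training samples. In this study, we make the first attempt to develop an unsupervised oil spill detection method for HSIs based on isolation forest. First, considering that the noise level varies across bands, a noise variance estimation method is used to evaluate the noise level of each band, and bands corrupted by severe noise are removed. Second, kernel principal component analysis (KPCA) is used to reduce the high dimensionality of the HSIs. Then, the isolation forest estimates the probability that each pixel belongs to seawater or to oil spill, and a clustering algorithm applied to the estimated probabilities automatically produces a set of pseudo-labeled training samples. Finally, an initial detection map is obtained by applying a support vector machine (SVM) to the dimension-reduced data, and the initial detection result is further refined with the extended random walker (ERW) model to improve the accuracy of oil spill detection. Experiments on airborne hyperspectral oil spill data (HOSD) that we created ourselves show that the proposed method achieves superior detection performance compared with other state-of-the-art detection methods.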
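The pseudo-labeling step, in which per-pixel probabilities are clustered into two groups, can be sketched as follows. The abstract does not name the clustering algorithm, so the one-dimensional two-means procedure below, the function name `two_means_1d`, and the example scores are all assumptions for illustration only.

```python
def two_means_1d(scores, iters=100):
    """Split 1-D scores into two clusters (0 = low scores, 1 = high scores).

    Assumption: a plain two-means clustering stands in for the unspecified
    clustering algorithm. Returns (labels, (low_centroid, high_centroid)).
    """
    lo, hi = min(scores), max(scores)
    if lo == hi:
        raise ValueError("all scores are identical; nothing to cluster")
    labels = []
    for _ in range(max(1, iters)):
        # Assign each score to the nearer centroid (ties go to the low cluster).
        labels = [1 if abs(s - hi) < abs(s - lo) else 0 for s in scores]
        low_group = [s for s, lab in zip(scores, labels) if lab == 0]
        high_group = [s for s, lab in zip(scores, labels) if lab == 1]
        new_lo = sum(low_group) / len(low_group)
        new_hi = sum(high_group) / len(high_group)
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return labels, (lo, hi)


if __name__ == "__main__":
    # Invented per-pixel scores: low values for one class, high for the other.
    pixel_scores = [0.05, 0.10, 0.12, 0.90, 0.95]
    labels, centroids = two_means_1d(pixel_scores)
    print(labels, centroids)
```

The resulting labels could then play the role of pseudo-labeled training samples for a downstream classifier such as an SVM; which cluster corresponds to oil and which to seawater depends on how the scores are defined.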
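In the era of deep learning, annotated datasets have become a crucial asset to the remote sensing community. Over the past decade, many different datasets have been published, each designed for a specific data type and a specific task or application. In the jungle of remote sensing datasets, it can be hard to keep track of what is already available. In this paper, we introduce EOD, the IEEE GRSS Earth Observation Database, an interactive online platform for cataloguing different types of datasets that leverage remote sensing imagery.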